UVid-Net: Enhanced Semantic Segmentation of UAV Aerial Videos by Embedding Temporal Information

Abstract

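Semantic segmentation of aerial videos has been extensively used for decision making in monitoring environmental changes, urban planning, and disaster management. The reliability of these decision support systems depends on the accuracy of the video semantic segmentation algorithms. Existing CNN-based video semantic segmentation methods have enhanced image semantic segmentation methods by incorporating an additional module, such as an LSTM or optical flow, for computing the temporal dynamics of the video, which adds computational overhead. The proposed research work modifies the CNN architecture by incorporating temporal information to improve the efficiency of video semantic segmentation. In this work, an enhanced encoder-decoder based CNN architecture (UVid-Net) is proposed for unmanned aerial vehicle (UAV) video semantic segmentation. The encoder of the proposed architecture embeds temporal information for temporally consistent labeling. The decoder is enhanced by introducing a feature-refiner module, which aids accurate localization of the class labels. The proposed UVid-Net architecture for UAV video semantic segmentation is quantitatively evaluated on the extended ManipalUAVid dataset. A mean Intersection over Union (mIoU) of 0.79 was observed, which is significantly greater than that of other state-of-the-art algorithms. Further, the proposed work produced promising results even when a UVid-Net model pretrained on urban street scenes was fine-tuned only in its final layer on UAV aerial videos.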
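The mean Intersection over Union metric reported in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the listing does not say how the paper averages over frames or handles absent classes.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, target: equal-length sequences of integer class labels
    (a flattened label map). Hypothetical helper for illustration only.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both maps.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # assumption: a class absent from both maps is ignored
        ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes on six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 3))  # per-class IoU is 1/3, 2/3, 1/2, so the mean is 0.5
```

Under this definition a score of 0.79 means that, averaged over classes, the predicted and ground-truth regions overlap on 79% of their combined area.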

Similar resources

Reading the Videos: Temporal Labeling for Crowdsourced Time-Sync Videos Based on Semantic Embedding

Recent years have witnessed a boom in online sharing of media content, which raises significant challenges for effective management and retrieval. Although considerable effort has been made, precise retrieval of video shots on specific topics has been largely ignored. At the same time, due to the popularity of novel time-sync comments, or so-called “bullet-screen comments”, video semantics ...

SEMBED: Semantic Embedding of Egocentric Action Videos

We present SEMBED, an approach for embedding an egocentric object interaction video in a semantic-visual graph to estimate the probability distribution over its potential semantic labels. When object interactions are annotated using unbounded choice of verbs, we embrace the wealth and ambiguity of these labels by capturing the semantic relationships as well as the visual similarities over motio...

Semantic Co-segmentation in Videos

Discovering and segmenting objects in videos is a challenging task due to large variations in object appearance, deformed shapes, and cluttered backgrounds. In this paper, we propose to segment objects and understand their visual semantics from a collection of videos that link to each other, which we refer to as semantic co-segmentation. Without any prior knowledge on videos, we first extra...

Dialogue Session Segmentation by Embedding-Enhanced TextTiling

In human-computer conversation systems, the context of a user-issued utterance is particularly important because it provides useful background information about the conversation. However, it is unwise to track all previous utterances in the current session, as not all of them are equally important. In this paper, we address the problem of session segmentation. We propose an embedding-enhanced TextTi...

Text Driven Temporal Segmentation of Cricket Videos

In this paper, we address the problem of temporal segmentation of videos. We present a multi-modal approach in which clues from different information sources are merged to perform the segmentation. Specifically, we segment videos based on textual descriptions or commentaries of the action in the video. Such parallel information is available for cricket videos, a class of videos where visual featu...

Journal

Journal title: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

Year: 2021

ISSN: 2151-1535, 1939-1404

DOI: https://doi.org/10.1109/jstars.2021.3069909